ABSTRACT

Most networks observed in real life obey a power-law degree
distribution. It is hypothesized that such a degree distribution
emerges from preferential attachment of the nodes. The
Barabasi-Albert model is a generative procedure that uses
degree-based preferential attachment, and one can use this model
to generate networks with a power-law degree distribution. In this
model, the network is assumed to grow by one node at every time
step. After the evolution of such a network, it is impossible to
predict the exact order of node arrivals. In this article, we
present a novel strategy to partially predict the order of node
arrivals in such an evolved network. We show that our proposed
method outperforms approaches based on other centrality measures.
We bin the nodes and predict the order of node arrivals between
the bins with an accuracy above 80%.

Keywords 
dering, node aging

1. INTRODUCTION 

Real world networks such as biological, social and technological
networks are the products of an evolutionary process. These
networks are generally classified as Scale-Free Networks (SFNs).
SFNs are a class of networks in which the degree distribution
follows a power law. Generative models such as Duplicate-Mutation,
Forest Fire and Preferential Attachment [T] have been proposed to
synthesize SFNs. The synthesis of dynamic SFNs involves a
continuous addition of new nodes to the existing network. The
behavior of each new node depends on the generative model being
used. It is interesting to study how nodes get assembled in a
complex network over time [10]. Given a snapshot of a dynamic
network, is it possible to probabilistically predict the
evolutionary sequence of the nodes in the network?

We propose a method that predicts the order of arrival of 
nodes in the given Scale-Free Network, modeled and syn- 
thesized using a specified generative model. This approach 
first computes a vertex ranking of the given network based 
on a ranking methodology. We then synthesize several such 
networks using the generative model that was used in the 
construction of the given network. It is important to note 
that the order of arrival of nodes in the synthesized net- 
works is known. The same ranking methodology is applied 
to compute the vertex ranking for each of the synthesized 
networks. The nodes in the given network are mapped to 
the nodes in a synthesized network, according to a bijec- 
tion function between the vertex rankings. We then predict 
the probable order of arrival of nodes in the given network, 
based on the bijective mapping and the order of arrival of 
nodes in the synthesized network. Repeating this mapping over
several synthesized networks associates a probability with every
pair of vertices. This probability indicates how likely it is that
the first vertex of the pair arrived before the second.

We then construct a Directed Graph (DG) by drawing an 
edge for every pair in their predicted order of arrival. We 
propose a binning methodology, wherein the nodes of the DG 
having similar characteristics are grouped into hypothetical 
containers called bins. The order of arrival of nodes within 
a bin is unknown. Hence, we determine the order of arrival 
of nodes across several such bins. 

2. PRELIMINARIES AND NOTATIONS 
2.1 Scale Free Networks 

A Scale-Free Network (SFN) is a network whose degree 
distribution follows a power law. Many real world networks 
are known to exhibit a decaying degree distribution. This 
kind of distribution is called a power law. Mathematically, 
it is defined as 

    $P(k) \sim c \, k^{-\gamma}$    (1)

where,

k is the degree,

c is a normalization constant and

$\gamma$ is a parameter whose value is typically in the range (2,3).



The high degree nodes in an SFN are often called "hubs".
The power law degree distribution of SFNs suggests the
existence of a small number of high degree nodes. Although
the hubs are small in number, they dominate the network
to a great extent. Removal of the hubs from the network
might cause a network breakdown and disrupt the network
characteristics. Figure 1 shows an example of an SFN. The
degree distribution of the same network is shown in Figure 2.







Figure 1: A Scale-Free Network of 200 nodes.




Figure 2: The degree distribution curve for the net- 
work in Figure 1. This network follows a power law 
degree distribution. 



2.1.1 Generative Model for Scale Free Networks

To explain the power law degree distribution in real world
networks, mechanisms such as preferential attachment and the
fitness model have been proposed. Barabasi and Albert proposed a
randomized algorithm for generating SFNs using a preferential
attachment mechanism. This model is referred to as the BA model.



Algorithm to construct a BA Network G(V_final, C):

Let C be the number of connections that each new node
must create on its arrival. Let V_final be the vertex set of the
completely generated network G. It is clear that |V_final| > C.
As the network evolves, let V and E be the instantaneous vertex
set and edge set of the intermediate networks respectively.

The nodes are designated by enumerating them as
{0, 1, 2, ..., (|V_final| - 1)}.

A complete network K_C with C nodes is constructed.
Now, |V| = C.

while |V| < |V_final| do

    Generate a new node u.

    Preferential Attachment: Let v ∈ V be sampled according to
    the cumulative degree distribution function, CDF(i), with
    CDF(-1) = 0:

        $CDF(i) = \sum_{j=0}^{i} \frac{degree(N_j)}{2 \cdot total\_edges}$, where $N_j \in V$    (2)

    iter ← 1

    while iter ≤ C do

        Let r be a real number uniformly picked at random
        in [0,1).

        Choose v ∈ V such that CDF(v - 1) ≤ r < CDF(v).
        if (u, v) ∉ E then
            append (u, v) to E
        else
            iter ← iter - 1
        end if

        iter ← iter + 1
    end while

    Add u to V.
end while

Figure 3 illustrates the growth of a BA Network G(9, 3).
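The growth procedure above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the experiments: the function name `generate_ba_network`, the `seed` parameter and the restriction to C ≥ 2 are assumptions made for the example. Sampling with `random.choices` and degree weights stands in for inverting the cumulative degree distribution.

```python
import random


def generate_ba_network(n_final, c, seed=None):
    """Grow a BA-style network and return (edges, chronology).

    Hypothetical helper: nodes are the integers 0..n_final-1 and the
    chronology is the order in which they were added.
    """
    if c < 2 or n_final <= c:
        raise ValueError("need c >= 2 and n_final > c")
    rng = random.Random(seed)
    # Seed network: the complete graph K_c on nodes 0..c-1.
    edges = {(u, v) for u in range(c) for v in range(u + 1, c)}
    degree = [c - 1] * c
    chronology = list(range(c))
    for new in range(c, n_final):
        targets = set()
        # Draw existing nodes with probability proportional to degree,
        # rejecting repeats so that the new node gets c distinct edges.
        while len(targets) < c:
            targets.add(rng.choices(range(new), weights=degree)[0])
        degree.append(0)
        for t in targets:
            edges.add((t, new))
            degree[t] += 1
            degree[new] += 1
        chronology.append(new)
    return edges, chronology
```

Calling `generate_ba_network(9, 3, seed=1)` grows a network of the size shown in Figure 3; the recorded chronology plays the role of the known arrival order.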









Figure 3: Growth of a BA Network with 9 nodes 
and 3 connections. 



2.2 Directed Acyclic Graph 

A Directed Acyclic Graph (DAG) is a directed graph
containing no cycles. The indegree of a node v in a directed
graph G is defined as |S|, where S = {(u,v) | (u,v) ∈ E_G}. It
is denoted by InDegree(v). The outdegree of a node v in a
directed graph G is defined as |S|, where S = {(v,u) | (v,u) ∈ E_G}.
It is denoted by OutDegree(v).

2.3 Lists and Index of an element 

A list is an ordered set of elements. The index of an element
u in a list L is the position at which the element u occurs in
L, denoted by index_L(u).

2.4 Centrality Measures 

A centrality measure is a function that associates a real
value with each vertex in a network [6]. The value indicates
how central or important the vertex is in the network. Here,
the term "important" is application specific. This gives rise 
to many centrality measures, each of which rates the nodes 
according to some property of the node. 

2.4.1 Degree Centrality 

The degree of a node is often interpreted as an effective measure
of influence or importance of that node in a network.
The degree of a node u in a graph is denoted by deg(u) [4]. The
Degree Centrality assigns a node u a value that is
proportional to deg(u).
Mathematically, for a graph G(V,E):

    $C_{degree}(v) = \frac{deg(v)}{|V| - 1}, \quad v \in V$    (3)

2.4.2 Betweenness Centrality 

Betweenness Centrality assigns a node v a value
that is proportional to the number of shortest paths [5] [9],
between all other pairs of vertices, that pass through v.

Let $\delta_{st}(v)$ denote the fraction of shortest paths between s and
t that contain the vertex v.

    $\delta_{st}(v) = \frac{\sigma_{st}(v)}{\sigma_{st}}$    (4)

where $\sigma_{st}$ denotes the number of all shortest paths from
vertex s to t and $\sigma_{st}(v)$ denotes the number of shortest paths
from s to t passing through v. Then the Betweenness
Centrality of a vertex v is given by

    $C_{betweenness}(v) = \sum_{s \neq v \neq t \in V} \delta_{st}(v)$    (5)

In our experiments, we have used Brandes' approach to
compute betweenness centrality [tJ.

2.4.3 Eigenvector Centrality 

The index in Eigenvector Centrality characterizes the
individuals in connected networks according to their level
of popularity [s]. It is a more sophisticated version of
Degree Centrality. A given node is said to be popular if it
is connected to many other nodes, or to a few nodes with very
high popularity. Mathematically, this can be formulated as
follows:

Let A be the adjacency matrix of the network G(V,E).
A_{u,v} = 1 if (u,v) ∈ E_G and A_{u,v} = 0 if (u,v) ∉ E_G. Let x_u
denote the centrality score of u ∈ V_G. x_u is proportional to
the sum of the scores of neighbors(u). Hence

    $x_u = \frac{1}{\lambda} \sum_{v \in V} A_{u,v} \, x_v$    (6)

where $\lambda$ is a constant.

On defining x = [x_0 x_1 x_2 ... x_{|V|-1}] as a vector of centrality
scores, we can transform the above equation into a matrix
form as

    $x = \frac{1}{\lambda} A x$    (7)

Assuming that we wish the centrality scores to be non-negative
real values, it can be shown (using the Perron-Frobenius
theorem) that $\lambda$ must be the largest eigenvalue of A. x is
the eigenvector corresponding to the eigenvalue $\lambda$.
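One common way to approximate the dominant eigenvector is power iteration. The sketch below is an illustration under stated assumptions, not the procedure used in the experiments: the function name, the adjacency-dictionary input and the choice to iterate with A + I (which has the same eigenvectors as A but avoids oscillation on bipartite graphs) are all choices made for this example.

```python
def eigenvector_centrality(adj, iterations=500, tol=1e-12):
    """Approximate eigenvector centrality by power iteration.

    adj: dict mapping each node to the set of its neighbours
    (undirected, connected graph assumed). Scores are scaled so
    that the largest score is 1.
    """
    x = {u: 1.0 for u in adj}
    for _ in range(iterations):
        # One multiplication by (A + I): own score plus neighbour scores.
        new = {u: x[u] + sum(x[v] for v in adj[u]) for u in adj}
        norm = max(new.values())
        new = {u: val / norm for u, val in new.items()}
        if max(abs(new[u] - x[u]) for u in adj) < tol:
            return new
        x = new
    return x
```

For a star with one hub and three leaves, the hub receives score 1 and each leaf receives 1/sqrt(3), which matches the eigenvector of the largest eigenvalue sqrt(3).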



2.5 Reference Network 

In our experiments, we study the SFNs generated using
the Barabasi-Albert Model. Let G_m(V_m, C_m) represent a
Barabasi-Albert Network whose vertex arrival order is to be
deduced. For evaluative purposes, we record the order of
arrival of vertices in G_m during its inception. Let list_true be a
sequence of vertices that represents the actual order of arrival
of vertices in G_m. We will be referring to G_m(V_m, C_m) in
all the further sections as the input network to the proposed
algorithm that predicts the order of arrival of nodes.

3. CENTRALITY MEASURE BASED
METHODS
3.1 Degree Binning 

The degree of a node is the number of connections associ- 
ated with that node. A naive approach towards the solution 
to the vertex arrival order prediction problem is to exploit 
and explore the contribution of this factor. 

From the preferential attachment model of SFN construction, it is
evident that the last few nodes that get connected to the network
will have a relatively low degree, as compared to the
nodes that arrived in the initial stages. Consider the
network G_m from section 2.5. Intuitively, we hypothesize
that the higher the degree of a node, the higher its influence in
the network, and the earlier it arrived during the network
evolution. We can state with high probability that the
notable hubs in G_m arrived prior to the nodes
with a relatively low degree.

Hence, we rank the nodes in decreasing order of their
degree. Nodes of equal degree are assigned the same
ranking. We then place the vertices with the same ranking
into a hypothetical container, referred to as a bin. The
ranking of a bin is the same as the ranking of the node(s) inside
the bin. The number of bins formed is the total number of
unique ranks assigned to the nodes. We then apply a Binning
Quality Measure (BQM) to compute the accuracy of
our prediction of the order of arrival of nodes across the bins.
BQM quantifies the prediction accuracy on a scale of 0 to 1.
Figure 5 illustrates the Binning Methodology that we use to
predict the order of arrival of nodes across the bins.




Figure 4: A SFN, constructed using BA model with 
9 nodes and 3 connections. 



Figure 5: Binning the nodes of the network in figure 
4 based on degree. The numbers below the bins 
denote the degree of the nodes that are present in 
the bin. 



The following mathematical formulation illustrates a technique
to quantify the correctness of our prediction. We refer
to the technique as the Binning Quality Measure (BQM).
Let $\delta$ be the number of bins. Let B = [B_0, B_1, B_2, ..., B_{\delta-1}]
be the predicted chronological bin ordering. We associate
a score $\beta$ with every pair of bins. The final prediction
measure $\eta$ is computed as the ratio of the sum of $\beta$ over all
bin-pairs to the total number of bin-pairs.

To calculate $\beta$ for a pair of bins B_i and B_j, with i < j:
Here, we claim that the nodes in B_i have arrived before the
nodes in B_j. Hence, we impose the condition i < j, with
reference to the predicted chronological bin ordering B.
For a pair of vertices u ∈ B_i and v ∈ B_j, we define

    vertexOrder(u, v) = 1 if index_{list_true}(u) < index_{list_true}(v)

    vertexOrder(u, v) = 0 if index_{list_true}(u) > index_{list_true}(v)

    $\beta(B_i, B_j) = \frac{\sum_{u \in B_i,\, v \in B_j} vertexOrder(u, v)}{|B_i| \, |B_j|}$

The final prediction measure $\eta$ is given by

    $\eta = \frac{\sum_{0 \le i < j < \delta} \beta(B_i, B_j)}{\binom{\delta}{2}}$
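The BQM computation can be sketched as below. The function name `binning_quality_measure` and the list-of-lists representation of the bins are assumptions for this example; the sketch also assumes at least two non-empty bins and that every binned node appears in the true arrival list.

```python
from itertools import combinations


def binning_quality_measure(bins, list_true):
    """Return the BQM score of a predicted bin ordering.

    bins: list of bins in predicted chronological order, each a
    list of nodes. list_true: nodes in their true arrival order.
    """
    index = {v: k for k, v in enumerate(list_true)}
    scores = []
    # combinations() yields bin pairs (B_i, B_j) with i < j.
    for b_i, b_j in combinations(bins, 2):
        correct = sum(1 for u in b_i for v in b_j if index[u] < index[v])
        scores.append(correct / (len(b_i) * len(b_j)))
    # Average of the per-pair scores over all bin pairs.
    return sum(scores) / len(scores)
```

A perfectly ordered binning such as `[[0, 1], [2], [3]]` against the true order `[0, 1, 2, 3]` scores 1, while `[[0, 2], [1]]` against `[0, 1, 2]` scores 0.5 because only one of the two cross-bin pairs is in the right order.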



3.2 Binning based on Centrality Measures

The main drawback of binning based on degree is that
the degree centrality indices associated with the nodes are
not distinct in G_m, because many nodes can share the same
degree. Hence, binning based on degree centrality results in a
small number of bins, with a large number of nodes per bin.
Ideally, it is desirable to have a larger number of bins with
fewer nodes per bin.

We move on to another approach which can provide
us with a large number of bins. In this approach, we apply
x centrality to the main graph. We conjecture that the higher
the x centrality of a node, the earlier it has arrived
in the network evolution. Hence, we sort the vertices in
decreasing order of their x centrality indices. We group the
nodes from this sorted ordering into $\delta$ bins, where $\delta$ is the
chosen number of bins, each bin containing

    $\frac{|V_m|}{\delta}$    (8)

nodes. We refer to the list of bins thus obtained as
binOrdering_x. In our experiments,
we choose x to be Betweenness Centrality and Eigenvector
Centrality. We use BQM (refer section 3.1) to quantify the
accuracy of the prediction using binning based on centrality.
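A minimal sketch of this equal-size binning is given below. The helper name `bin_by_score` is hypothetical, and rounding the bin size up with `math.ceil` is an assumption for the case where the number of nodes is not divisible by the number of bins; in that case the last bin is smaller and fewer bins than requested may result.

```python
import math


def bin_by_score(scores, num_bins):
    """Group nodes into consecutive bins by decreasing centrality.

    scores: dict mapping node -> centrality value.
    Returns a list of bins, highest-scoring nodes first.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    size = math.ceil(len(ranked) / num_bins)  # assumed rounding rule
    return [ranked[k:k + size] for k in range(0, len(ranked), size)]
```

With the scores `{"a": 5, "b": 3, "c": 4, "d": 1}` and two bins, the result is `[["a", "c"], ["b", "d"]]`, which can then be scored with BQM.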

4. A NEW VERTEX RANKING:
DIFFERENTIAL CORE RANKING

In this section, we formulate a new method of ranking
nodes. Let G(V, E) be any graph. Let DCR_G represent the
Differential Core Ranking of G.

Let x be any centrality measure. Let G_0 be the initial graph.
Let G_1 be the graph obtained from G_0 after removal of the nodes
with the minimum degree. The change in the x centrality value
of each node in G_0 is set as the attribute of the corresponding
node. We then apply the above procedure starting with G_1.
Let G_2 be the graph obtained from G_1 after the removal
of the nodes with the minimum degree. The change in the x
centrality value of the nodes in G_1 is added to the attribute
of the corresponding node.

In general, let G_{i+1} be the graph obtained from G_i after
the removal of the nodes with the minimum degree. The change
in the x centrality value of the nodes in G_i is added to
the attribute of the corresponding node. This procedure is
repeated until there are no nodes left in G_i.

The algorithm to compute DCR_G is as follows:
Let x represent any centrality measure
Let G_0 represent the given graph G

Let u ∈ V(G). Let the Differential Core Measure DCM_u
be a value associated with u.
Set DCM_u = 0 ∀ u ∈ V(G)

Let x_{u,G_k} represent the x centrality value of u in G_k.
Let i ← 0

while |V(G_i)| > 0 do

    Let minDeg ← min(deg(u)), u ∈ V(G_i)

    Let minVertices ← {u_0, u_1, ..., u_n}, deg(u_m) = minDeg

    Let G_{i+1} ← graph obtained after removing
    minVertices from G_i

    DCM_u ← DCM_u + abs(x_{u,G_{i+1}} - x_{u,G_i}) ∀ u ∈ V(G_i)
    and u ∈ V(G_{i+1})
    DCM_u ← DCM_u + abs(0 - x_{u,G_i}) ∀ u ∈ V(G_i) and
    u ∉ V(G_{i+1})

    i ← i + 1
end while

DCR_G ← {(DCM_{u_0}, u_0), (DCM_{u_1}, u_1), ..., (DCM_{u_{|V_G|-1}}, u_{|V_G|-1})}

DCR_G gives the Differential Core Ranking of the vertices.
DCM_u denotes the centrality score of the node u. The higher
the sum of changes in the x centrality values of a node,
the higher its importance in the network.
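The peeling procedure can be sketched as follows. This is an illustrative reading of the algorithm rather than the authors' code: the function names are hypothetical, the graph is passed as an adjacency dictionary, and raw degree is used as the default base centrality x purely to keep the example self-contained. Any function that maps an adjacency dictionary to a score per node can be passed instead.

```python
def degree_centrality(adj):
    """Base centrality used by default in this sketch: raw degree."""
    return {u: float(len(nbrs)) for u, nbrs in adj.items()}


def differential_core_measure(adj, centrality=degree_centrality):
    """Accumulate, per node, the absolute change in centrality
    caused by repeatedly removing all minimum-degree nodes."""
    graph = {u: set(nbrs) for u, nbrs in adj.items()}  # working copy
    dcm = {u: 0.0 for u in graph}
    current = centrality(graph)
    while graph:
        min_deg = min(len(nbrs) for nbrs in graph.values())
        removed = {u for u, nbrs in graph.items() if len(nbrs) == min_deg}
        graph = {u: nbrs - removed
                 for u, nbrs in graph.items() if u not in removed}
        nxt = centrality(graph) if graph else {}
        # Removed nodes count as having centrality 0 in the next graph.
        for u, old in current.items():
            dcm[u] += abs(nxt.get(u, 0.0) - old)
        current = nxt
    return dcm


def differential_core_ranking(adj, centrality=degree_centrality):
    """Nodes sorted by decreasing Differential Core Measure."""
    dcm = differential_core_measure(adj, centrality)
    return sorted(dcm, key=dcm.get, reverse=True)
```

On a star with hub 0 and leaves 1, 2, 3, the leaves are peeled first and each accumulates 1, while the hub drops from degree 3 to 0 and accumulates 3, so the hub is ranked first.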

5. NETWORK RECONSTRUCTION 
ALGORITHM 

In this section of the paper, we describe our algorithm
to predict the order of arrival of nodes in G_m.

Our algorithm is divided into 4 subsections. Section
5.1 aims at the generation of Synthetic Networks that resemble
G_m. Section 5.2 describes a mapping procedure and the
derivation of prediction lists. In section 5.3, we analyze the
prediction lists and construct a directed graph. Section 5.4
deals with the transformation of the directed graph to a directed
acyclic graph and the binning of nodes.

5.1 Generation of Synthetic Networks 

The main focus of this section of the algorithm is to
recreate the growth environment of the reference network
G_m. Since the exact replication of G_m is not possible, we
generate networks that are similar to G_m in certain
characteristics. We refer to this set of networks as Synthetic
Networks.

Let a be the number of Synthetic Networks generated. Let
S_i and chronology_{S_i} denote the i-th Synthetic Network and the
order of arrival of nodes in the corresponding S_i. In our
experiments, we use the BA model to generate S_i, with |V_m|
nodes and C_m connections. It is worth noting
that every time we generate a Synthetic Network S_i, we
keep track of the network growth by recording chronology_{S_i}.
Since the Synthetic Networks are built on the same model
as G_m, we hypothesize that the chronology of S_i is
similar to the actual order of arrival of nodes in G_m. Hence,
it is reasonable to make use of chronology_{S_i} in predicting the
probable order of arrival of nodes in G_m.

5.2 Mapping and Derivation of Prediction Lists 

We have now generated a number of BA Synthetic Networks
that are similar to G_m in terms of the number of vertices
|V_m| and connections C_m. The chronology of the Synthetic
Networks S_i, where 1 ≤ i ≤ a, is known. In this section, we
intend to derive an ordering of nodes in V_m, corresponding
to each S_i. This ordering of nodes is the predicted order
of arrival of nodes in G_m (during its inception), derived in
accordance with chronology_{S_i}. We refer to the node ordering
corresponding to S_i as PredList_i. The procedure that we
follow to deduce PredList_i is explained in the remainder of
the section.

We apply DCR, with x as the base centrality measure (refer
to section 2.4), to G_m in order to obtain DCR_{G_m}. DCR_{G_m}
is a list of vertices sorted according to their DCM
values (refer to section 4).

Consider a Synthetic Network S_i. We apply DCR, with x
centrality as the base centrality measure, to S_i in order to
obtain DCR_{S_i}.

Both DCR_{G_m} and DCR_{S_i} list the vertices of G_m and S_i
respectively in decreasing order of their importance. The lower
the position of a vertex in these lists, the higher its importance
in the corresponding network. A direct bijection mapping
is carried out between DCR_{G_m} and DCR_{S_i}. This mapping
pairs the equi-important vertices in both the networks.

Mathematically, we define a mapping function as:

Let f_map : V_{S_i} → V_{G_m} be a direct bijection between V_{S_i} and
V_{G_m}, i.e., f_map(u) = v where u ∈ V_{S_i}, v ∈ V_{G_m} and
index_{DCR_{S_i}}(u) = index_{DCR_{G_m}}(v)

We propose that the nodes of equal importance in G_m and
S_i have the same chronological ranking. Since we know
chronology_{S_i}, we deduce PredList_i by replacing each vertex
u in chronology_{S_i} with f_map(u).

We repeat the above procedure for each S_i. At this stage,
we have a prediction lists, denoted by PredList_i, each
corresponding to a particular S_i.

Algorithm for Mapping:

Input: The Reference Network G_m and Synthetic Networks
{S_1, S_2, ..., S_a}
Output: a Prediction Lists

Apply DCR, with x as the base centrality measure, to G_m
Let u_i ∈ V_m : 1 ≤ i ≤ |V_m|

Let DCR_{G_m}(u_i) denote the DCR associated with the vertex u_i

Let the tuple list M ← {(DCR_{G_m}(u_1), u_1), (DCR_{G_m}(u_2), u_2),
    ..., (DCR_{G_m}(u_{|V_m|}), u_{|V_m|})}

Sort M in the descending order of DCR_{G_m}(u_i)
for all i = 1 to a do

    Let v_j ∈ V_{S_i} : 1 ≤ j ≤ |V_{S_i}|

    Let (v_1, v_2, ..., v_{|V_{S_i}|}) denote chronology_{S_i}

    Apply DCR, with x centrality as the base centrality, to
    the Synthetic Network S_i

    Let DCR_{S_i}(v_j) denote the DCR of the vertex v_j
    Let the tuple list N ← {(DCR_{S_i}(v_1), v_1), (DCR_{S_i}(v_2), v_2),
        ..., (DCR_{S_i}(v_{|V_{S_i}|}), v_{|V_{S_i}|})}

    Sort N in the descending order of DCR_{S_i}(v_j)

    Let f_map : V_{S_i} → V_{G_m} be a bijection between V_{S_i} and V_{G_m},
    f_map(u) = v where u ∈ V_{S_i}, v ∈ V_{G_m} and index_N(u) =
    index_M(v)

    PredList_i ← (f_map(v_1), f_map(v_2), ..., f_map(v_{|V_{S_i}|}))
end for
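The core of the mapping step, once both rankings are available, is a position-wise pairing of the two sorted lists. A small sketch under that reading follows; the function name `derive_pred_list` and its argument names are hypothetical, and the two rankings are assumed to be lists of nodes already sorted by decreasing DCM.

```python
def derive_pred_list(ranking_gm, ranking_si, chronology_si):
    """Translate the known chronology of a synthetic network S_i
    into a predicted arrival order for the reference network G_m.

    ranking_gm / ranking_si: nodes of G_m / S_i by decreasing DCM.
    chronology_si: nodes of S_i in their recorded arrival order.
    """
    if len(ranking_gm) != len(ranking_si):
        raise ValueError("both networks must have the same number of nodes")
    # f_map pairs the k-th most important node of S_i with the
    # k-th most important node of G_m.
    f_map = dict(zip(ranking_si, ranking_gm))
    return [f_map[v] for v in chronology_si]
```

For instance, with `ranking_gm = ["a", "b", "c"]`, `ranking_si = [2, 0, 1]` and `chronology_si = [0, 1, 2]`, node 0 of S_i is second in importance and maps to "b", so the predicted list is `["b", "c", "a"]`.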

Figures 6 to 9 illustrate an instance of the mapping of nodes
between G_m and any S_i : 1 ≤ i ≤ a. Figure 10 illustrates the
derivation of the prediction list PredList_i using chronology_{S_i}.

Figure 6: Applying Differential Core Ranking, with
Betweenness Centrality as the base centrality, to G_m.

Figure 7: Applying Differential Core Ranking, with
Betweenness Centrality as the base centrality, to one
of the S_i : 1 ≤ i ≤ a.

Figure 9: Direct bijection mapping of vertices between
Lists in figure 8.

Figure 10: Deduction of PredList_i by reordering the
nodes of V_m according to chronology_{S_i}.

5.3 Analysis of Prediction Lists and Construc- 
tion of Directed Graph 

In the previous section, we have deduced a number of
Prediction Lists, PredList_i : 1 ≤ i ≤ a. For every pair of
vertices (u,v) : u,v ∈ V_{G_m}, we find the order of occurrence
of u and v in each PredList_i. Let P(u,v) denote the
probability of u arriving before v during the inception of G_m.
We compute P(u,v) as the fraction of the a Prediction Lists in
which u occurs before v. If P(u,v) < 0.5,
then v has probably arrived before u during the inception of
G_m. Hence, we set P(v,u) = 1 - P(u,v). We then construct
a Directed Graph DG with vertex set V_DG = V_m, and edge
set E_DG = ∅. A directed edge from u to v in DG indicates
that u has arrived before v during the construction of G_m.
For a pair of vertices (u, v):

if P(u,v) > 0.5, then we say that u has arrived before v with
a probability P(u,v)

if P(u,v) < 0.5, then we say that v has arrived before u with
a probability 1 - P(u,v)



Figure 8: Vertex ordering based on decreasing Differential
Core Ranking for V_{G_m} and V_{S_i}.



The algorithm to deduce DG is presented below: 






Let S ← {S_1, S_2, S_3, ..., S_a} denote the set of Synthetic
Networks

Let PredList_i denote the Prediction List corresponding
to S_i : 1 ≤ i ≤ a (Refer to algorithm in section 5.2)
Construct a Directed Graph DG_m with V_{DG_m} = V_m and
E_{DG_m} = ∅

Let P(u,v) be the probability associated with (u, v) : u,v ∈
V_{DG_m}, determining if u has come before v.
for all unordered pairs (u, v) : u,v ∈ V_m and u ≠ v do
    count ← 0
    for i ← 1 to a do

        if index_{PredList_i}(u) < index_{PredList_i}(v) then

            count ← count + 1
        end if
    end for

    P(u,v) ← count / a
    if P(u,v) > 0.5 then

        append (u, v) to E_{DG_m} with a weight P(u,v)
    else

        append (v, u) to E_{DG_m} with a weight 1 - P(u,v)
    end if
end for
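The pairwise counting above can be sketched in Python as follows. The function name `build_directed_graph` and the dictionary-of-weighted-edges output are assumptions for this example; every prediction list is assumed to contain the same set of nodes.

```python
from itertools import combinations


def build_directed_graph(pred_lists):
    """Return {(earlier, later): probability} for every node pair.

    pred_lists: list of predicted arrival orders, one per
    synthetic network, all over the same node set.
    """
    a = len(pred_lists)
    positions = [{v: k for k, v in enumerate(pl)} for pl in pred_lists]
    weights = {}
    for u, v in combinations(pred_lists[0], 2):
        # Fraction of prediction lists in which u precedes v.
        count = sum(1 for pos in positions if pos[u] < pos[v])
        p = count / a
        if p > 0.5:
            weights[(u, v)] = p
        else:
            weights[(v, u)] = 1 - p
    return weights
```

With the three lists `[0, 1, 2]`, `[0, 2, 1]` and `[1, 0, 2]`, node 0 precedes node 2 in all lists, giving the edge (0, 2) weight 1, while the edges (0, 1) and (1, 2) each get weight 2/3.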

In the next section, we analyze DG to obtain the final predicted
order of arrival of nodes in V_m.

5.4 Transformation of Directed Graph and Node 
Binning 

In this section, we process the DG obtained from the previous
section to deduce the final prediction of the order of arrival
of nodes in G_m. Ideally we expect DG to be acyclic,
as cycles would give rise to an inconsistent prediction
order among the nodes involved in the cycle. For example,
let us say (u, v) and (v, w) are in E_DG. This implies that u has
arrived before v and v has arrived before w. Hence, u must
have arrived before w. If (w, u) is also an edge, then it leads to
a contradiction in the chronological ordering of u, v and w.
Since there is a fair possibility that DG is a cyclic graph,
we transform it into a Directed Acyclic Graph
(DAG) and remove the inconsistencies involved. In our
algorithm, we use a greedy technique to achieve this.

The algorithm to transform DG into DAG is presented
below:

Input: Directed Graph DG.

Output: Directed Acyclic Graph DAG.

while DG contains cycles do

    Remove the edge (u, v) with the least P(u,v), (u, v) ∈ E_DG.
end while
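A direct, unoptimized sketch of this greedy loop is shown below. The function names are hypothetical, the cycle check uses Kahn's topological-sort procedure (a standard technique chosen for this example, not something the text specifies), and rechecking for cycles after every removal makes the sketch quadratic in the number of edges.

```python
def has_cycle(nodes, edges):
    """Kahn's algorithm: a cycle exists iff some node is never freed."""
    indeg = {n: 0 for n in nodes}
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    stack = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while stack:
        n = stack.pop()
        seen += 1
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                stack.append(m)
    return seen < len(indeg)


def make_acyclic(nodes, weights):
    """Drop lowest-probability edges until no cycle remains.

    weights: dict {(u, v): P(u, v)}. Returns the surviving edges.
    """
    remaining = dict(weights)
    for edge in sorted(weights, key=weights.get):
        if not has_cycle(nodes, remaining):
            break
        del remaining[edge]
    return remaining
```

For the cycle 0 → 1 → 2 → 0 with weights 0.9, 0.8 and 0.55, only the weakest edge (2, 0) is removed.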

The DAG thus obtained is free from inconsistencies.
InDegree(v) represents the number of nodes that have been
predicted to arrive before v. Ideally, the node
that arrived earliest should have zero InDegree. The
next earliest node should have an InDegree equal to 1, and
so on. Since we are probabilistically simulating the growth
environment of G_m, it is not always possible in practice for
the nodes to follow this ideal InDegree sequence.


As the last step of the algorithm, we carry out the node
binning process. We find all the vertices v ∈ V_DAG having
the least InDegree(v) and group them into a bin B_1.
The binned vertices are then removed from the DAG. We
repeat this step iteratively until there are no nodes left in
the DAG. At each iterative step i, we bin the nodes into a
bin B_i. By ordering the bins according to their indices,
we get the final predicted bin ordering.

Algorithm to bin the nodes from DAG is presented below:
Input: Directed Acyclic Graph DAG
Output: Bin Ordering
count ← 1

while |V_DAG| > 0 do

    minInDeg ← min(InDegree(u)) where u ∈ V_DAG

    Let B_count ← {u : u ∈ V_DAG
        and InDegree(u) = minInDeg}
    Remove all the nodes in B_count from V_DAG
        i.e., V_DAG ← V_DAG - B_count

    count ← count + 1
end while

Let binOrdering ← [B_1, B_2, B_3, ..., B_{count-1}]

binOrdering gives the predicted chronological sequence of 
bins. The order of arrival of nodes within a bin is unknown. 
But the order of arrival of nodes across several such bins 
can be determined. The accuracy of this prediction, in con- 
trast with accuracy of prediction using centrality measures, 
is discussed in the next section. 
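The binning loop can be sketched as below. This sketch assumes that InDegree is recomputed on the remaining nodes after each bin is removed, which is how "removed from the DAG" is read here; the function name `bin_nodes` is hypothetical, and nodes inside a bin are sorted only to make the output deterministic.

```python
def bin_nodes(nodes, edges):
    """Peel a DAG into bins of minimum in-degree nodes.

    nodes: iterable of nodes; edges: iterable of (u, v) pairs
    meaning u is predicted to arrive before v.
    Returns the bins in predicted chronological order.
    """
    remaining = set(nodes)
    bins = []
    while remaining:
        # In-degree restricted to nodes that are still present.
        indeg = {n: 0 for n in remaining}
        for u, v in edges:
            if u in remaining and v in remaining:
                indeg[v] += 1
        min_indeg = min(indeg.values())
        current = sorted(n for n in remaining if indeg[n] == min_indeg)
        bins.append(current)
        remaining -= set(current)
    return bins
```

For the diamond-shaped DAG with edges (0, 1), (0, 2), (1, 3), (2, 3), the result is `[[0], [1, 2], [3]]`: nodes 1 and 2 share a bin because their relative order is undetermined.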

6. RESULTS AND DISCUSSIONS 

6.1 Comparison between the predictions from
Differential Core Ranking and Plain Centrality

The Centrality Index of a vertex in a network indicates its
relative importance in the network (refer section 2.4). Let x
be a base centrality measure. We hypothesize that the higher
the relative importance of a vertex in a network G_m, the earlier
it has arrived during its evolution. Hence, the vertices in
the network are arranged in descending order of their x
centrality indices. Let this ordering of the nodes be denoted
by Plain_{x,G_m}. We apply DCR (refer section 4), with the
same centrality x as the base centrality, to the network G_m.
The vertices in the network are arranged in descending
order of their DCR values. Let this ordering of the nodes be
denoted by Differential_{x,G_m}.

For experimental purposes, the actual order of arrival of
nodes in G_m is recorded. It is denoted by list_true. Let
the predicted order be denoted by list_pred. To compute the
accuracy of our prediction, we define a new quality measure
called $\eta(list_{true}, list_{pred})$.



    $\eta(list_{true}, list_{pred}) = \frac{n_c}{\binom{|V_m|}{2}}$    (9)

where n_c is the number of pairs in list_pred that are in correct
relative order with respect to list_true, and the denominator is
the total number of vertex pairs. To compare the prediction
accuracy for the lists Plain_{x,G_m} and Differential_{x,G_m},
we just compare the values of $\eta(list_{true}, Plain_{x,G_m})$ and
$\eta(list_{true}, Differential_{x,G_m})$. In our experiments we
consider the cases where x represents Degree Centrality,
Betweenness Centrality and Eigenvector Centrality. The
following figures represent the plots used to compare the values
of $\eta(list_{true}, Plain_{x,G_m})$ and $\eta(list_{true}, Differential_{x,G_m})$
for varying number of nodes. Note that the number of
connections C_m is kept constant.
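A small sketch of this pairwise measure follows. It assumes that the score is the number of correctly ordered pairs divided by the total number of pairs, and that both lists contain the same nodes; the function name `pairwise_order_accuracy` is hypothetical.

```python
from itertools import combinations


def pairwise_order_accuracy(list_true, list_pred):
    """Fraction of node pairs whose relative order in list_pred
    agrees with their relative order in list_true."""
    true_pos = {v: k for k, v in enumerate(list_true)}
    # combinations() yields (u, v) with u before v in list_pred.
    pairs = list(combinations(list_pred, 2))
    n_c = sum(1 for u, v in pairs if true_pos[u] < true_pos[v])
    return n_c / len(pairs)
```

A fully reversed prediction scores 0, an exact prediction scores 1, and swapping only the first two of four nodes leaves five of the six pairs correctly ordered.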
Figure 11: Comparison of Differential Core Ranking
(Red line), with Betweenness as the base centrality
measure, and Plain Betweenness Centrality
(Blue line) for the BA Networks with 3 connections.
The x-axis represents the number of nodes. The
y-axis denotes $\eta(list_{true}, Differential_{Betweenness,G_m})$
and $\eta(list_{true}, Plain_{Betweenness,G_m})$.

Figure 12: Comparison of Differential Core Ranking
(Red line), with Degree as the base centrality
measure, and Plain Degree Centrality (Blue
line) for the BA Networks with 3 connections.
The x-axis represents the number of nodes. The
y-axis denotes $\eta(list_{true}, Differential_{Degree,G_m})$ and
$\eta(list_{true}, Plain_{Degree,G_m})$.

Figure 13: Comparison of Differential Core Ranking
(Red line), with Eigenvector as the base centrality
measure, and Plain Eigenvector Centrality
(Blue line) for the BA Networks with 3 connections.
The x-axis represents the number of nodes.
The y-axis denotes $\eta(list_{true}, Differential_{Eigen,G_m})$
and $\eta(list_{true}, Plain_{Eigen,G_m})$.

Figure 14: Comparison of Differential Core Ranking
(Red line), with Betweenness as the base centrality
measure, and Plain Betweenness Centrality
(Blue line) for the BA Networks with 1000 nodes.
The x-axis represents the connections C_m. The
y-axis denotes $\eta(list_{true}, Differential_{Betweenness,G_m})$
and $\eta(list_{true}, Plain_{Betweenness,G_m})$.

Figure 15: Comparison of Differential Core Ranking
(Red line), with Degree as the base centrality
measure, and Plain Degree Centrality (Blue
line) for the BA Networks with 1000 nodes.
The x-axis represents the connections C_m. The
y-axis denotes $\eta(list_{true}, Differential_{Degree,G_m})$ and
$\eta(list_{true}, Plain_{Degree,G_m})$.

Figure 16: Comparison of Differential Core Ranking
(Red line), with Eigenvector as the base centrality
measure, and Plain Eigenvector Centrality
(Blue line) for the BA Networks with 1000
nodes. The x-axis represents the connections C_m.
The y-axis denotes $\eta(list_{true}, Differential_{Eigen,G_m})$
and $\eta(list_{true}, Plain_{Eigen,G_m})$.

Figures 11-13 illustrate the performance of our algorithm
in comparison with centrality based binning, for a varying
number of nodes in G_m. Figures 14-16 illustrate the
performance of our algorithm in comparison with centrality based
binning, for varying connections C_m in G_m.

6.2 Prediction of arrival order in every node 
pair with an attached probability 

The outcome of section 5.3 is a weighted directed graph
DG. We have associated a probability P(u,v) with every
directed edge (u,v) ∈ E_DG. P(u,v) indicates the probability
with which u has arrived before v. From the construction
mechanism of DG, it is clear that P(u,v) ≥ 0.5. The closer the
value of P(u,v) is to 0.5, the harder it is to ascertain the
chronological ordering of u and v. Note that there is a fair
possibility that DG can contain cycles. We claim that the
inconsistencies in the prediction might be caused by edges with
P(u,v) close to 0.5, which may lead to the formation of cycles.

We now present the analytical results that we have obtained,
considering G_m as the reference network. We have generated
G_m using a BA model with 1000 nodes and 3 connections.
We generate 50 synthetic networks, so we set a = 50. The
analytical results thus obtained are given below:

Range of P(u,v)    Fraction of edges    Fraction in correct relative order
(0.5, 0.6]         0.216606606607       0.546827487407
(0.6, 0.7]         0.156592592593       0.652739778568
(0.7, 0.8]         0.137975975976       0.767770861446
(0.8, 0.9]         ...                  ...

Statistically, from the above table, we observe that the edges
(u, v) having P(u,v) in (0.5, 0.6] constitute around 20% of the
edges. We also note that only around 50% of these edges are
in the correct relative order with list_true. Since a large
fraction of edges belonging to this range are in incorrect relative
ordering, they contribute to the cycle formation. Cycles
introduce inconsistencies in node arrival order, hence they
have to be removed. From our experiments, we have found
that DG becomes acyclic when we remove the edges
(u,v) continually in increasing order of P(u,v) until
P(u,v) ≈ 0.6. We implement the same technique in section 5.4
to transform DG to DAG.

Based on the facts and figures from the table, we observe
that the fraction of pairs that are in correct relative order
with list_true increases as the sampled range increases. Hence
we conclude that a higher P(u,v) implies a stronger notion of
relative ordering of (u,v).

6.3 Comparison between the predictions from
DCR binning and Plain Centrality binning

The end result of our method (section 5.4) is the ordering
of the bins, referred to as binOrdering_{DCR_x}. Let A be the
number of bins in binOrdering_{DCR_x}. Let $\eta_{DCR_x}$ denote the
BQM value of binOrdering_{DCR_x}, where x refers to the base
centrality measure for DCR.

We derive binOrdering_x (refer section 3.2) with A
bins, x indicating the centrality measure. Let
binOrdering_betweenness, binOrdering_eigen and binOrdering_deg
denote the chronology of bins with x set as Betweenness,
Eigenvector and Degree Centralities respectively.

Let $\eta_{betweenness}$, $\eta_{eigen}$ and $\eta_{degree}$ denote the BQM values of
binOrdering_betweenness, binOrdering_eigen and binOrdering_deg
respectively. Finally, we compare $\eta_{betweenness}$, $\eta_{eigen}$, $\eta_{degree}$
and $\eta_{DCR_x}$, where x is the base centrality (refer section 4).

We observe that the predicted chronological sequence of bins
obtained from DCR, for any base centrality x, is more
accurate than those of the other centrality based approaches.

Figure 17: The plot denotes the BQM score for various
binning methodologies for the reference graph
G_m of 1000 nodes and 3 connections. In our
experiment, we have set a = 50. The results we
obtained are as follows: $\eta_{DCR_{degree}}$ = 0.804513946531,
$\eta_{degree}$ = 0.767615011251, $\eta_{betweenness}$ = 0.759827243464,
$\eta_{eigen}$ = 0.695466553648, number of bins = 91.

Figure 18: The plot denotes the BQM score for various
binning methodologies for the reference graph
G_m of 1000 nodes and 3 connections. In our
experiment, we have set a = 50. The results we
obtained are as follows: $\eta_{DCR_{betweenness}}$ = 0.87153926121,
$\eta_{degree}$ = 0.8251012352, $\eta_{betweenness}$ = 0.8158246115,
$\eta_{eigen}$ = 0.7823167778, number of bins = 63.



7. CONCLUSIONS 

We presented a novel framework for uncovering the precursor
of an SFN evolved by the preferential attachment model.
Our approach involves synthesizing many such SFNs, mapping
these SFNs to the reference network based on the DCR
score associated with the nodes, and arriving at the final
predicted order. We presented results based on a novel indexing
method called the differential core ranking, which
provided better node arrival prediction than the approaches
based on standard centrality measures.

Our approach can be put to practice in situations where one
is given a real world network (which is known to have evolved
by preferential attachment) and is interested in obtaining
the order of node arrivals. A useful application would be to
unravel the age of the links in the WWW network, which is known
to be scale-free [1]. Also, knowing the age of the nodes in
a disease spreading network would help us determine the
susceptibility of the nodes to infection. For example,
a newly arrived node is more susceptible to infection than
a node that has been present in the network
for long, since such a node might have developed the
necessary immunity to counter the infection. Our results
show that, if a network is known to have evolved in steps,
then its chronology can be effectively recovered.
Figure 19: The plot denotes the BQM score for various
binning methodologies for the reference graph
G_m of 1000 nodes and 3 connections. In our
experiment, we have set a = 50. The results we
obtained are as follows: $\eta_{DCR_{eigen}}$ = 0.84654821986,
$\eta_{degree}$ = 0.7697124538121, $\eta_{betweenness}$ = 0.753169421166,
$\eta_{eigen}$ = 0.6899122714632, number of bins = 77.